-
Notifications
You must be signed in to change notification settings - Fork 161
chore(worker,web): Repo indexing stability improvements + perf improvements to web #563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis PR performs a comprehensive refactor of the repository indexing system, UI architecture, and job orchestration. It replaces BullMQ-based RepoManager with a groupmq-based RepoIndexManager, restructures the database schema to use discrete job records, refactors Git operations to support cancellation, and significantly reorganizes web UI components including removal of connection management pages and restructuring of the homepage, search, and chat interfaces. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Web as Web Frontend
participant SearchPage as Search Page
participant LandingPage as SearchLandingPage
participant ResultsPage as SearchResultsPage
participant Backend as Backend API
User->>Web: Navigate to /search
Web->>SearchPage: Render with domain & query params
alt Query is empty
SearchPage->>LandingPage: Render landing with repos & stats
LandingPage->>Backend: Fetch recommended repos & stats
Backend-->>LandingPage: Return repos with indexedAt
LandingPage->>User: Show carousel & search guide
else Query provided
SearchPage->>ResultsPage: Render with query
ResultsPage->>Backend: Execute search (React Query)
Backend-->>ResultsPage: Return results with duration/stats
ResultsPage->>User: Show results panel + filters + preview
User->>ResultsPage: Toggle filter panel (mod+b)
User->>ResultsPage: Select file/match for preview
ResultsPage->>User: Update code preview panel
end
sequenceDiagram
participant Backend as Backend
participant RepoIndexManager as RepoIndexManager
participant GroupMQ as GroupMQ Queue
participant Worker as Job Worker
participant Git as Git Operations
Backend->>RepoIndexManager: new RepoIndexManager(db, settings, redis)
Backend->>RepoIndexManager: startScheduler()
RepoIndexManager->>RepoIndexManager: scheduleIndexJobs() (periodic)
RepoIndexManager->>GroupMQ: createJobs({type: INDEX, repoId, ...})
RepoIndexManager->>Worker: Start worker with concurrency
loop Job Processing
GroupMQ-->>Worker: Dequeue job
Worker->>Git: cloneRepository(signal) or fetchRepository(signal)
Git-->>Worker: Repo ready
Worker->>Git: upsertGitConfig({path, gitConfig, signal})
Worker->>Git: indexGitRepository(signal)
Worker->>RepoIndexManager: Job complete
RepoIndexManager->>RepoIndexManager: onJobCompleted (update indexedAt)
end
Note over Worker,Git: AbortSignal enables cancellation throughout
Estimated code review effort🎯 5 (Critical) | ⏱️ 90+ minutes This PR involves substantial, intricate changes spanning multiple architectural layers: replacement of the entire job orchestration system (BullMQ → groupmq), database schema migration with breaking changes, comprehensive UI restructuring (homepage removal, search/chat rewrite, connections page removal), and backend git operation refactoring for cancellation support. The heterogeneous nature of changes—touching nearly every layer of both backend and frontend—and the high logic density in new systems like RepoIndexManager demand separate reasoning for each major subsystem. Possibly related PRs
Suggested labels
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This comment has been minimized.
This comment has been minimized.
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 24
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (12)
packages/web/src/app/[domain]/components/searchModeSelector.tsx (1)
18-18
: Fix typo in TODO comment.The comment contains
@tood
instead of@todo
.Apply this diff:
-// @tood: point this to the actual docs page +// @todo: point this to the actual docs pagepackages/web/src/features/chat/components/chatThread/chatThread.tsx (1)
131-136
: Pass both provider and model identifiers to ensure correct model matching.The code passes only
selectedLanguageModel.model
togenerateAndUpdateChatNameFromMessage
, but the lookup inactions.ts:197
searches all configured models using only the model field. If different providers offer models with identical names, this could match the wrong provider's configuration.Since
LanguageModelInfo
maintainsprovider
andmodel
as separate fields (not a combined identifier), the lookup should verify both to ensure uniqueness:
- chatThread.tsx line 134: Pass
{ provider: selectedLanguageModel.provider, model: selectedLanguageModel.model }
or use a combined identifier- actions.ts line 197: Match using both provider and model:
.find((model) => model.provider === providerId && model.model === modelId)
packages/web/src/components/ui/navigation-menu.tsx (1)
1-6
: Mark this shadcn/Radix UI wrapper as a Client Component.Radix primitives require client execution. Without "use client", importing this from a Server Component will error at build/runtime.
Apply:
+ "use client"; + import * as React from "react" import * as NavigationMenuPrimitive from "@radix-ui/react-navigation-menu"packages/web/src/app/[domain]/layout.tsx (1)
27-35
: Fix Next.js App Routerlayout
params typing and remove unnecessary await.
params
is not a Promise. Current typing/await is misleading and can confuse tooling.-interface LayoutProps { - children: React.ReactNode, - params: Promise<{ domain: string }> -} +interface LayoutProps { + children: React.ReactNode; + params: { domain: string }; +} ... -export default async function Layout(props: LayoutProps) { - const params = await props.params; +export default async function Layout(props: LayoutProps) { + const params = props.params;packages/web/src/initialize.ts (1)
129-139
: Prisma: findUnique with extra filter is invalid and will not compile/run.findUnique only accepts a unique selector; adding role filter makes it non-unique. Use findFirst with combined conditions, or fetch by unique key then check role.
Apply one of these fixes:
- const guestUser = await prisma.userToOrg.findUnique({ - where: { - orgId_userId: { - orgId: SINGLE_TENANT_ORG_ID, - userId: SOURCEBOT_GUEST_USER_ID, - }, - role: { - not: OrgRole.GUEST, - } - }, - }); + const guestUser = await prisma.userToOrg.findFirst({ + where: { + orgId: SINGLE_TENANT_ORG_ID, + userId: SOURCEBOT_GUEST_USER_ID, + role: { not: OrgRole.GUEST }, + }, + });Alternatively:
- const guestUser = await prisma.userToOrg.findUnique({ where: { orgId_userId: { orgId: SINGLE_TENANT_ORG_ID, userId: SOURCEBOT_GUEST_USER_ID }, role: { not: OrgRole.GUEST } }}); + const edge = await prisma.userToOrg.findUnique({ where: { orgId_userId: { orgId: SINGLE_TENANT_ORG_ID, userId: SOURCEBOT_GUEST_USER_ID } }}); + const guestUser = edge && edge.role !== OrgRole.GUEST ? edge : null;packages/web/src/app/[domain]/chat/components/tutorialDialog.tsx (1)
139-141
: Typo: “Reposet”.Consider “Repo set” or “Repository set” in the UI copy.
packages/web/src/actions.ts (2)
537-564
: Auth bypass: user-supplied where can override orgId.Spreading ...where after orgId lets callers query other orgs. Force orgId via AND.
-export const getRepos = async ({ - where, - take, -}: { - where?: Prisma.RepoWhereInput, - take?: number -} = {}) => sew(() => +export const getRepos = async ({ + where, + take, +}: { + where?: Prisma.RepoWhereInput, + take?: number +} = {}) => sew(() => withOptionalAuthV2(async ({ org, prisma }) => { - const repos = await prisma.repo.findMany({ - where: { - orgId: org.id, - ...where, - }, - take, - }); + const whereClause: Prisma.RepoWhereInput = + where ? { AND: [ where, { orgId: org.id } ] } : { orgId: org.id }; + const MAX = 200; + const repos = await prisma.repo.findMany({ + where: whereClause, + take: take ? Math.min(Math.max(take, 1), MAX) : undefined, + orderBy: { id: 'desc' }, + }); return repos.map((repo) => repositoryQuerySchema.parse({ /* unchanged */ })) }));
1731-1766
: Prevent token/header leakage on image proxy (SSRF guard).Authorization headers are added before validating the image URL’s origin. A malicious imageUrl could exfiltrate tokens. Only attach auth when the image host matches the expected provider host; otherwise send no credentials.
- const authHeaders: Record<string, string> = {}; - for (const { connection } of repo.connections) { + const authHeaders: Record<string, string> = {}; + const imageUrlObj = new URL(repo.imageUrl); + for (const { connection } of repo.connections) { try { if (connection.connectionType === 'github') { - const config = connection.config as unknown as GithubConnectionConfig; - if (config.token) { - const token = await getTokenFromConfig(config.token, connection.orgId, prisma); - authHeaders['Authorization'] = `token ${token}`; - break; - } + // GitHub avatars are public; do NOT forward tokens to third-party origins. + // Intentionally no auth header here. } else if (connection.connectionType === 'gitlab') { const config = connection.config as unknown as GitlabConnectionConfig; if (config.token) { const token = await getTokenFromConfig(config.token, connection.orgId, prisma); - authHeaders['PRIVATE-TOKEN'] = token; - break; + const expectedHost = new URL(config.url ?? 'https://gitlab.com').hostname; + if (imageUrlObj.hostname === expectedHost) { + authHeaders['PRIVATE-TOKEN'] = token; + } + break; } } else if (connection.connectionType === 'gitea') { - const config = connection.config as unknown as GiteaConnectionConfig; - if (config.token) { - const token = await getTokenFromConfig(config.token, connection.orgId, prisma); - authHeaders['Authorization'] = `token ${token}`; - break; - } + // Gitea avatars are typically public; avoid forwarding tokens. } } catch (error) { logger.warn(`Failed to get token for connection ${connection.id}:`, error); } }Optionally add a short timeout to fetch using AbortController to avoid hanging requests.
packages/web/src/app/[domain]/components/pathHeader.tsx (1)
258-274
: Fix nested interactive elements in dropdown (Link inside DropdownMenuItem)Current structure nests an interactive DropdownMenuItem inside an anchor, which harms a11y and can impede keyboard selection. Use Radix’s asChild to make Link the actionable element.
Apply this diff:
- {hiddenSegments.map((segment) => ( - <Link - href={getBrowsePath({ - repoName: repo.name, - path: segment.fullPath, - pathType: segment.isLastSegment ? pathType : 'tree', - revisionName: branchDisplayName, - domain, - })} - className="font-mono text-sm hover:cursor cursor-pointer" - key={segment.fullPath} - > - <DropdownMenuItem className="hover:cursor cursor-pointer"> - {renderSegmentWithHighlight(segment)} - </DropdownMenuItem> - </Link> - ))} + {hiddenSegments.map((segment) => ( + <DropdownMenuItem key={segment.fullPath} asChild> + <Link + href={getBrowsePath({ + repoName: repo.name, + path: segment.fullPath, + pathType: segment.isLastSegment ? pathType : 'tree', + revisionName: branchDisplayName, + domain, + })} + className="font-mono text-sm" + > + {renderSegmentWithHighlight(segment)} + </Link> + </DropdownMenuItem> + ))}packages/backend/src/git.ts (3)
61-78
: Clone CWD/destination mismatch: this will fail or clone into a nested path.You set cwd to path and then call git.clone(..., path). That attempts to clone inside the target directory or fails if it exists. For a bare clone with cwd=path, clone into "." and ensure the directory is empty.
Apply these diffs:
- import { mkdir } from 'node:fs/promises'; + import { mkdir, readdir } from 'node:fs/promises';- await mkdir(path, { recursive: true }); - - const git = createGitClientForPath(path, onProgress, signal); + await mkdir(path, { recursive: true }); + const contents = await readdir(path); + if (contents.length > 0) { + throw new Error(`Clone destination exists and is not empty: ${path}`); + } + const git = createGitClientForPath(path, onProgress, signal);- await git.clone(cloneUrl, path, cloneArgs); + // Clone into the current (empty) directory + await git.clone(cloneUrl, ".", cloneArgs);
108-139
: Cleanup can override the primary error and misses env/signal; use helper and unset-all.
- finally’s git.raw can throw and mask the fetch error.
- Use the same helper (ceiling + signal), scope to local, and remove all extraHeader entries.
Apply this diff:
- } finally { - if (authHeader) { - const git = simpleGit({ - progress: onProgress, - }).cwd({ - path: path, - }) - - await git.raw(["config", "--unset", "http.extraHeader", authHeader]); - } - } + } finally { + if (authHeader) { + try { + const git = createGitClientForPath(path, onProgress, signal); + await git.raw(["config", "--local", "--unset-all", "http.extraHeader"]); + } catch { + // best-effort cleanup; do not override the original error + } + } + }
192-203
: Config scope bug; prefer --local and --unset-all, and don’t rely on listConfig(all).listConfig aggregates all scopes; you may try to unset keys that only exist globally and trigger errors. Unset locally and ignore “not set” errors. Also handle multi-valued keys.
Apply this diff:
- try { - const configList = await git.listConfig(); - const setKeys = Object.keys(configList.all); - - for (const key of keys) { - if (setKeys.includes(key)) { - await git.raw(['config', '--unset', key]); - } - } - } catch (error: unknown) { + try { + for (const key of keys) { + try { + await git.raw(['config', '--local', '--unset-all', key]); + } catch { + // ignore if not set locally + } + } + } catch (error: unknown) {
🧹 Nitpick comments (51)
packages/web/src/lib/utils.ts (2)
379-381
: Enhance JSDoc with examples and edge case documentation.The comment "Converts a number to a string" is minimal. Consider documenting the formatting behavior and providing examples.
Apply this diff to improve the documentation:
/** - * Converts a number to a string + * Converts a number to a shortened display string with SI suffixes. + * - Numbers < 1,000 are returned as-is (e.g., 42 → "42") + * - Numbers < 1,000,000 are displayed in thousands (e.g., 1,500 → "1.5k") + * - Numbers ≥ 1,000,000 are displayed in millions (e.g., 2,300,000 → "2.3m") + * + * @param number The number to format + * @returns The formatted string with appropriate suffix */
382-390
: Consider adding input validation for edge cases.The function doesn't validate for
NaN
,Infinity
, or negative numbers. While unlikely in the context of displaying counts/progress, defensive validation would prevent unexpected output if corrupted data is passed.Apply this diff to add validation:
export const getShortenedNumberDisplayString = (number: number) => { + if (!Number.isFinite(number)) { + return '0'; + } + if (number < 0) { + return '0'; + } if (number < 1000) { return number.toString(); } else if (number < 1000000) { return `${(number / 1000).toFixed(1)}k`; } else { return `${(number / 1000000).toFixed(1)}m`; } }packages/web/src/app/[domain]/components/searchModeSelector.tsx (1)
31-31
: Consider syncingfocusedSearchMode
with thesearchMode
prop.The
focusedSearchMode
state is initialized from thesearchMode
prop but won't update if the parent changes the prop value. This could cause the tooltip to display for the wrong mode.Add a
useEffect
to keep them in sync:const [focusedSearchMode, setFocusedSearchMode] = useState<SearchMode>(searchMode); + +useEffect(() => { + setFocusedSearchMode(searchMode); +}, [searchMode]);Don't forget to import
useEffect
:-import { useCallback, useState } from "react"; +import { useCallback, useState, useEffect } from "react";packages/web/src/app/[domain]/search/components/searchLandingPage.tsx (1)
20-31
: Parallelize data fetches to cut TTFB.
Currently sequential; use Promise.all.- const carouselRepos = await getRepos({ + const [carouselRepos, repoStats] = await Promise.all([ + getRepos({ where: { indexedAt: { not: null }, }, take: 10, - }); - - const repoStats = await getReposStats(); + }), + getReposStats(), + ]);packages/web/src/app/[domain]/search/components/searchResultsPage.tsx (6)
323-329
: Use a button for “load more” (accessibility + semantics).
Divs aren’t keyboard/AT friendly.- <div - className="cursor-pointer text-blue-500 text-sm hover:underline ml-4" - onClick={onLoadMoreResults} - > - (load more) - </div> + <button + type="button" + className="text-blue-500 text-sm hover:underline ml-4" + onClick={onLoadMoreResults} + aria-label="Load more results" + > + (load more) + </button>
303-308
: Handle clipboard write asynchronously and robustly.
Avoid unhandled promise; return success/failure.- <CopyIconButton - onCopy={() => { - navigator.clipboard.writeText(JSON.stringify(searchStats, null, 2)); - return true; - }} - className="ml-auto" - /> + <CopyIconButton + onCopy={async () => { + try { + await navigator.clipboard.writeText(JSON.stringify(searchStats ?? null, null, 2)); + return true; + } catch { + return false; + } + }} + className="ml-auto" + />
213-215
: Scope filter-panel state per domain.
Current localStorage key is global; include domain to avoid cross-domain bleed.Option A (simpler): call useDomain inside PanelGroup.
const PanelGroup = ({ ... }: PanelGroupProps) => { + const domain = useDomain(); ... - const [isFilterPanelCollapsed, setIsFilterPanelCollapsed] = useLocalStorage('isFilterPanelCollapsed', false); + const [isFilterPanelCollapsed, setIsFilterPanelCollapsed] = + useLocalStorage(`isFilterPanelCollapsed:${domain}`, false);Option B: pass
domain
from SearchResultsPage to PanelGroup as a prop.
266-275
: Add accessible label to the filter toggle button.
Improves screen reader UX.- <Button + <Button variant="ghost" size="icon" className="h-8 w-8" + aria-label="Open filter panel" onClick={() => { filterPanelRef.current?.expand(); }} >
160-165
: Spinner: add role and live region.
Assistive tech will announce progress.- <div className="flex flex-col items-center justify-center h-full gap-2"> + <div className="flex flex-col items-center justify-center h-full gap-2" role="status" aria-live="polite">
74-78
: Optional: keep previous data to reduce UI jank on refetch.- refetchOnWindowFocus: false, - retry: false, - staleTime: 0, + refetchOnWindowFocus: false, + retry: false, + staleTime: 0, + keepPreviousData: true,packages/web/src/app/[domain]/components/repositoryCarousel.tsx (5)
80-87
: Use stable keys and remove redundant child key.Index keys cause reconciliation issues;
RepositoryBadge
doesn’t need its own key.- {displayRepos.map((repo, index) => ( - <CarouselItem key={index} className="basis-auto"> - <RepositoryBadge - key={index} - repo={repo} - /> - </CarouselItem> - ))} + {displayRepos.map((repo) => ( + <CarouselItem key={repo.repoName} className="basis-auto"> + <RepositoryBadge repo={repo} /> + </CarouselItem> + ))}
40-45
: External docs link: prefer<a target="_blank" rel="noopener noreferrer">
for consistency and safety.Elsewhere you already do this for docs. Align here too.
- <Link href={`https://docs.sourcebot.dev/docs/connections/overview`} className="text-blue-500 hover:underline inline-flex items-center gap-1"> - connection - </Link>{" "} + <a + href="https://docs.sourcebot.dev/docs/connections/overview" + target="_blank" + rel="noopener noreferrer" + className="text-primary hover:underline inline-flex items-center gap-1" + > + connection + </a>{" "}
150-155
: Prevent long repo names from overflowing.Add truncation and hidden overflow so badges don’t blow up layout.
- className={clsx("flex flex-row items-center gap-2 border rounded-md p-2 text-clip")} + className={clsx("flex flex-row items-center gap-2 border rounded-md p-2 overflow-hidden")} > {repoIcon} - <span className="text-sm font-mono"> + <span className="text-sm font-mono truncate"> {displayName} </span>
64-78
: Consider pausing auto-scroll on focus for keyboard users.Add
stopOnFocusIn: true
so the carousel doesn’t move while tabbing/interacting; keepstopOnMouseEnter
as-is. Based on learnings.Autoscroll({ startDelay: 0, speed: 1, stopOnMouseEnter: true, stopOnInteraction: false, + stopOnFocusIn: true, }),
141-156
: Perf: consider disabling prefetch on many repo links.If
displayRepos
is large, Next.js may prefetch many pages. You can opt out per-link.- <Link + <Link + prefetch={false} href={getBrowsePath({packages/web/src/app/[domain]/chat/components/demoCards.tsx (1)
14-20
: Consider renaming the interface to follow React conventions.The interface
DemoCards
has the same name as the component, which diverges from the typical React/TypeScript pattern of naming props interfacesComponentNameProps
.Apply this diff to align with React conventions:
-interface DemoCards { +interface DemoCardsProps { demoExamples: DemoExamples; } export const DemoCards = ({ demoExamples, -}: DemoCards) => { +}: DemoCardsProps) => {packages/web/src/features/chat/components/chatBox/chatBox.tsx (2)
294-301
: Disabled styles won’t apply to Editable; add aria-disabled + conditional classes.Tailwind’s disabled: variants won’t trigger here because Editable doesn’t get a disabled attribute. Use aria-disabled and explicit class toggles.
- <Editable - className="w-full focus-visible:outline-none focus-visible:ring-0 bg-background text-base disabled:cursor-not-allowed disabled:opacity-50 md:text-sm" + <Editable + aria-disabled={isDisabled} + className={cn( + "w-full focus-visible:outline-none focus-visible:ring-0 bg-background text-base md:text-sm", + isDisabled && "cursor-not-allowed opacity-50" + )} placeholder="Ask a question about your code. @mention files or select search scopes to refine your query." renderElement={renderElement} renderLeaf={renderLeaf} onKeyDown={onKeyDown} readOnly={isDisabled} />
83-86
: Guard focus and key handlers when disabled to avoid confusing UX.When isDisabled, prevent focusing via "/" and short‑circuit onKeyDown.
- useHotkeys("/", (e) => { - e.preventDefault(); - ReactEditor.focus(editor); - }); + useHotkeys( + "/", + (e) => { + if (isDisabled) return; + e.preventDefault(); + ReactEditor.focus(editor); + }, + [editor, isDisabled] + );- const onKeyDown = useCallback((event: KeyboardEvent<HTMLDivElement>) => { + const onKeyDown = useCallback((event: KeyboardEvent<HTMLDivElement>) => { + if (isDisabled) return; if (suggestionMode === "none") { ... - }, [suggestionMode, suggestions, onSubmit, index, onInsertSuggestion]); + }, [suggestionMode, suggestions, onSubmit, index, onInsertSuggestion, isDisabled]);Also applies to: 209-251, 37-49, 44-49
packages/web/src/features/chat/types.ts (1)
174-179
: Tighten schema to reduce drift with server model.If you have canonical provider/model enums or a server Zod schema, import/derive from that instead of z.string(). Otherwise, at least enforce non‑empty strings.
-export const languageModelInfoSchema = z.object({ - provider: z.string(), - model: z.string(), - displayName: z.string().optional(), -}); +export const languageModelInfoSchema = z.object({ + provider: z.string().min(1), + model: z.string().min(1), + displayName: z.string().optional(), +});packages/web/src/app/[domain]/chat/page.tsx (1)
24-38
: Fetch independent data in parallel to reduce TTFB.getConfiguredLanguageModelsInfo, getSearchContexts, getRepos, carousel getRepos, and getReposStats don’t depend on each other; run them concurrently.
- const languageModels = await getConfiguredLanguageModelsInfo(); - const searchContexts = await getSearchContexts(params.domain); - const allRepos = await getRepos(); - const carouselRepos = await getRepos({ - where: { indexedAt: { not: null } }, - take: 10, - }); - const repoStats = await getReposStats(); + const [ + languageModels, + searchContexts, + allRepos, + carouselRepos, + repoStats, + ] = await Promise.all([ + getConfiguredLanguageModelsInfo(), + getSearchContexts(params.domain), + getRepos(), + getRepos({ where: { indexedAt: { not: null } }, take: 10 }), + getReposStats(), + ]);Also applies to: 55-63
packages/web/src/features/chat/components/chatBox/languageModelSelector.tsx (2)
64-67
: De-duplication is O(n²); switch to a Map for linear-time.Improves perf for long model lists without changing behavior.
Apply:
- const languageModels = useMemo(() => { - return _languageModels.filter((model, selfIndex, selfArray) => - selfIndex === selfArray.findIndex((t) => getLanguageModelKey(t) === getLanguageModelKey(model)) - ); - }, [_languageModels]); + const languageModels = useMemo(() => { + const seen = new Map<string, LanguageModelInfo>(); + for (const m of _languageModels) { + const k = getLanguageModelKey(m); + if (!seen.has(k)) seen.set(k, m); + } + return Array.from(seen.values()); + }, [_languageModels]);
74-105
: Popover/Tooltip trigger stacking and a11y.
- Add aria-expanded/aria-controls on the Button for screen readers.
- Hide Tooltip while Popover is open to avoid dual overlays.
Apply:
- <Button - onClick={handleTogglePopover} + <Button + aria-haspopup="listbox" + aria-expanded={isPopoverOpen} + aria-controls="model-popover" + onClick={handleTogglePopover} className={cn( "flex p-1 rounded-md items-center justify-between bg-inherit h-6", className )} >- <TooltipContent side="bottom" className="p-0 border-0 bg-transparent shadow-none"> + {/* Avoid tooltip when popover is open */} + {!isPopoverOpen && ( + <TooltipContent side="bottom" className="p-0 border-0 bg-transparent shadow-none"> <LanguageModelInfoCard /> - </TooltipContent> + </TooltipContent> + )}- <PopoverContent + <PopoverContent + id="model-popover" className="w-auto p-0" align="start" onEscapeKeyDown={() => setIsPopoverOpen(false)} >Also applies to: 109-157
packages/web/src/app/[domain]/chat/components/landingPageChatBox.tsx (1)
14-18
: Rename props interface to avoid shadowing the component name.Improves readability and IDE hints.
Apply:
-interface LandingPageChatBox { +interface LandingPageChatBoxProps { languageModels: LanguageModelInfo[]; repos: RepositoryQuery[]; searchContexts: SearchContextQuery[]; } -export const LandingPageChatBox = ({ +export const LandingPageChatBox = ({ languageModels, repos, searchContexts, -}: LandingPageChatBox) => { +}: LandingPageChatBoxProps) => {Also applies to: 20-24
packages/web/src/features/chat/components/chatBox/searchScopeSelector.tsx (2)
189-203
: Auto-focus the search input when the popover opens.Speeds keyboard workflows and improves a11y.
Apply:
- // Measure virtualizer when popover opens and container is mounted + // Measure virtualizer when popover opens and container is mounted; focus input useEffect(() => { if (isOpen) { setIsMounted(true); setHighlightedIndex(0); // Give the DOM a tick to render before measuring requestAnimationFrame(() => { if (scrollContainerRef.current) { virtualizer.measure(); } + inputRef.current?.focus(); }); } else { setIsMounted(false); } }, [isOpen, virtualizer]);- <div className="flex items-center border-b px-3"> - <Input + <div className="flex items-center border-b px-3"> + <Input + ref={inputRef} placeholder="Search scopes..." value={searchQuery} onChange={(e) => setSearchQuery(e.target.value)} onKeyDown={handleInputKeyDown} className="border-0 focus-visible:ring-0 focus-visible:ring-offset-0 h-11" />Add at top with other refs:
- const [searchQuery, setSearchQuery] = useState(""); + const [searchQuery, setSearchQuery] = useState(""); + const inputRef = useRef<HTMLInputElement>(null);Also applies to: 264-270
229-236
: Add aria roles/attributes for better screen reader support.
- Button should expose combobox semantics.
- List container should be role="listbox" with option items and aria-activedescendant.
Apply:
- <Button + <Button + aria-haspopup="listbox" + aria-expanded={isOpen} + aria-controls="search-scope-popover" ref={ref} {...props} onClick={handleTogglePopover}- <PopoverContent + <PopoverContent + id="search-scope-popover" className="w-[400px] p-0" align="start" onEscapeKeyDown={() => onOpenChanged(false)} >- <div + <div ref={scrollContainerRef} - className="max-h-[300px] overflow-auto" + className="max-h-[300px] overflow-auto" + role="listbox" + aria-activedescendant={ + sortedSearchScopeItems.length > 0 + ? `scope-${sortedSearchScopeItems[highlightedIndex]?.item.type}-${sortedSearchScopeItems[highlightedIndex]?.item.value}` + : undefined + } >- <div - key={`${item.type}-${item.value}`} + <div + id={`scope-${item.type}-${item.value}`} + role="option" + aria-selected={isSelected} + key={`${item.type}-${item.value}`} onClick={() => toggleItem(item)}Also applies to: 257-263, 276-347, 297-344
packages/web/src/components/ui/navigation-menu.tsx (1)
43-45
: Remove duplicate "group" class on Trigger.
navigationMenuTriggerStyle()
already includesgroup
; passing"group"
again is redundant.- className={cn(navigationMenuTriggerStyle(), "group", className)} + className={cn(navigationMenuTriggerStyle(), className)}Also applies to: 53-55
packages/web/src/app/[domain]/components/navigationMenu/navigationItems.tsx (2)
34-41
: Use Next.js Link with Radix Link viaasChild
for prefetch and a11y.Radix
NavigationMenuLink
renders an<a>
; using Next<Link>
as child improves routing (prefetch) and avoids nested interactive elements elsewhere.-import { NavigationMenuItem, NavigationMenuLink, NavigationMenuList, navigationMenuTriggerStyle } from "@/components/ui/navigation-menu"; +import { NavigationMenuItem, NavigationMenuLink, NavigationMenuList, navigationMenuTriggerStyle } from "@/components/ui/navigation-menu"; +import Link from "next/link"; ... -<NavigationMenuLink - href={`/${domain}/search`} - className={cn(navigationMenuTriggerStyle(), "gap-2")} -> +<NavigationMenuLink asChild className={cn(navigationMenuTriggerStyle(), "gap-2")}> + <Link href={`/${domain}/search`}> <SearchIcon className="w-4 h-4 mr-1" /> Search -</NavigationMenuLink> + </Link> +</NavigationMenuLink> ... -<NavigationMenuLink href={`/${domain}/chat`} className={navigationMenuTriggerStyle()}> +<NavigationMenuLink asChild className={navigationMenuTriggerStyle()}> + <Link href={`/${domain}/chat`}> <MessageCircleIcon className="w-4 h-4 mr-1" /> Ask -</NavigationMenuLink> + </Link> +</NavigationMenuLink> ... -<NavigationMenuLink href={`/${domain}/repos`} className={navigationMenuTriggerStyle()}> +<NavigationMenuLink asChild className={navigationMenuTriggerStyle()}> + <Link href={`/${domain}/repos`}> <BookMarkedIcon className="w-4 h-4 mr-1" /> <span className="mr-2">Repositories</span> <Badge ...>...</Badge> -</NavigationMenuLink> + </Link> +</NavigationMenuLink> ... -<NavigationMenuLink href={`/${domain}/settings`} className={navigationMenuTriggerStyle()}> +<NavigationMenuLink asChild className={navigationMenuTriggerStyle()}> + <Link href={`/${domain}/settings`}> <SettingsIcon className="w-4 h-4 mr-1" /> Settings -</NavigationMenuLink> + </Link> +</NavigationMenuLink>Also applies to: 44-51, 54-66, 71-79
24-29
: Tighten active matching to path segments.
startsWith
can over-match (e.g.,/org/settings-advanced
). Consider segment-aware checks.-const isActive = (href: string) => { - if (href === `/${domain}`) return pathname === `/${domain}`; - return pathname.startsWith(href); -}; +const isActive = (href: string) => { + if (href === `/${domain}`) return pathname === `/${domain}`; + const withSlash = href.endsWith("/") ? href : `${href}/`; + return pathname === href || pathname.startsWith(withSlash); +};packages/web/src/app/[domain]/layout.tsx (1)
143-146
:headers()
andcookies()
are synchronous in App Router.No need to
await
; dropping it avoids confusion.-const headersList = await headers(); -const cookieStore = await cookies() +const headersList = headers(); +const cookieStore = cookies();packages/web/src/app/[domain]/components/navigationMenu/trialIndicator.tsx (1)
31-40
: Clamp negative days and handle pluralization.If
nextBillingDate
is past, UI can show negative days. Clamp to 0 and pluralize.-<span> - {Math.ceil((subscription.nextBillingDate * 1000 - Date.now()) / (1000 * 60 * 60 * 24))} days left in trial -</span> +{(() => { + const days = Math.max( + 0, + Math.ceil((subscription.nextBillingDate * 1000 - Date.now()) / (1000 * 60 * 60 * 24)) + ); + return <span>{days} {days === 1 ? "day" : "days"} left in trial</span>; +})()}packages/web/src/app/[domain]/components/navigationMenu/progressIndicator.tsx (1)
70-75
: Pluralize “View more” label.Minor UX polish for singular vs plural.
- {`View ${numRepos - sampleRepos.length} more`} + {`View ${numRepos - sampleRepos.length} more ${numRepos - sampleRepos.length === 1 ? "repository" : "repositories"}`}packages/web/src/app/[domain]/components/navigationMenu/index.tsx (1)
109-137
: Prefer simple anchor links for external destinations.Server Actions are overkill for static external links and add overhead. Use
<Link href=...>
or<a>
withrel="noreferrer"
where appropriate.-<form action={async () => { "use server"; redirect(SOURCEBOT_DISCORD_URL); }}> - <Button variant="outline" size="icon" type="submit"> - <DiscordLogoIcon className="w-4 h-4" /> - </Button> -</form> +<a href={SOURCEBOT_DISCORD_URL} target="_blank" rel="noreferrer"> + <Button asChild variant="outline" size="icon"> + <span><DiscordLogoIcon className="w-4 h-4" /></span> + </Button> +</a> ... -<form action={async () => { "use server"; redirect(SOURCEBOT_GITHUB_URL); }}> +<a href={SOURCEBOT_GITHUB_URL} target="_blank" rel="noreferrer"> <Button variant="outline" size="icon" type="submit"> <GitHubLogoIcon className="w-4 h-4" /> </Button> -</form> +</a>packages/web/src/initialize.ts (2)
37-39
: Config equality via JSON.stringify can be brittle.Key order/undefined differences can create false positives. Consider stable stringify or deep-equal with key sorting.
Example:
const stable = (o: unknown) => JSON.stringify(o, Object.keys(o as object).sort()); const syncNeededOnUpdate = (currentConnectionConfig && stable(currentConnectionConfig) !== stable(newConnectionConfig)) || (currentConnection?.syncStatus === ConnectionSyncStatus.FAILED);
70-89
: Deleting declarative connections: ensure referential integrity.Hard-deleting connections may violate FK constraints (e.g., repos, jobs). Wrap deletes in a transaction and confirm onDelete cascade/cleanup.
Suggestion:
await prisma.$transaction(async (tx) => { for (const c of deletedConnections) { await tx.connection.delete({ where: { id: c.id }}); } });packages/backend/src/ee/userPermissionSyncer.ts (1)
104-110
: Make shutdown resilient; close concurrently and guard against partial failures.If worker.close() throws, queue.close() is skipped. Close both with Promise.allSettled and log failures.
public async dispose() { if (this.interval) { clearInterval(this.interval); } - await this.worker.close(); - await this.queue.close(); + const results = await Promise.allSettled([this.worker.close(), this.queue.close()]); + results.forEach((r, i) => { + if (r.status === 'rejected') { + Sentry.captureException(r.reason, { tags: { component: 'user-permission-syncer', phase: 'dispose', target: i === 0 ? 'worker' : 'queue' }}); + } + }); }package.json (1)
9-9
: Ensure DB is migrated before dev services start.Previous flow ran migrations up front. If backend doesn’t run them on boot, newcomers may hit schema errors.
Options:
- Document: run
yarn dev:prisma:migrate:dev
once beforeyarn dev
.- Or add a guarded prestart: a small script that checks a migration fingerprint and prompts the user.
packages/web/src/app/[domain]/chat/layout.tsx (1)
11-11
: Remove unnecessary await on cookies().cookies() is synchronous in Next server components; drop await for clarity.
- const isTutorialDismissed = (await cookies()).get(AGENTIC_SEARCH_TUTORIAL_DISMISSED_COOKIE_NAME)?.value === "true"; + const isTutorialDismissed = cookies().get(AGENTIC_SEARCH_TUTORIAL_DISMISSED_COOKIE_NAME)?.value === "true";packages/web/src/actions.ts (2)
1840-1847
: Harden and persist the tutorial-dismissed cookie.Set path, sameSite, secure; you already set maxAge (great).
export const setAgenticSearchTutorialDismissedCookie = async (dismissed: boolean) => sew(async () => { const cookieStore = await cookies(); cookieStore.set(AGENTIC_SEARCH_TUTORIAL_DISMISSED_COOKIE_NAME, dismissed ? "true" : "false", { httpOnly: false, // Allow client-side access maxAge: 365 * 24 * 60 * 60, // 1 year in seconds + path: '/', + sameSite: 'lax', + secure: process.env.NODE_ENV === 'production', }); return true; });
1849-1853
: Persist and scope the mobile splash-screen dismissal cookie, too.Align with the tutorial cookie so it survives reloads and is site-wide.
export const dismissMobileUnsupportedSplashScreen = async () => sew(async () => { const cookieStore = await cookies(); - cookieStore.set(MOBILE_UNSUPPORTED_SPLASH_SCREEN_DISMISSED_COOKIE_NAME, 'true'); + cookieStore.set(MOBILE_UNSUPPORTED_SPLASH_SCREEN_DISMISSED_COOKIE_NAME, 'true', { + httpOnly: false, + maxAge: 365 * 24 * 60 * 60, + path: '/', + sameSite: 'lax', + secure: process.env.NODE_ENV === 'production', + }); return true; });packages/backend/src/repoCompileUtils.ts (1)
49-55
: Use POSIX joins for repoName construction (cross‑platform).path.join is OS‑dependent; on Windows it will emit backslashes and break URL-like repo names. Prefer path.posix.join for URL segments.
Example:
import path from 'path'; const repoName = path.posix.join(repoNameRoot, repoDisplayName);Also applies to: 122-131, 199-207, 266-273, 351-358, 408-412, 520-521, 645-647, 686-691
packages/backend/vitest.config.ts (1)
7-9
: Use a temp test cache dir to prevent repo pollutionData cache is currently set to
'test-data'
(relative path), which writes test artifacts to the repository root during test execution. No cleanup hooks are present in the test suite, and'test-data/'
is not in.gitignore
, so the directory persists after tests complete, polluting the working tree.Using OS temp directory (
os.tmpdir()
) is best practice for test caches and eliminates this issue.Apply this diff:
-import { defineConfig } from 'vitest/config'; +import { defineConfig } from 'vitest/config'; +import os from 'node:os'; +import path from 'node:path'; export default defineConfig({ test: { environment: 'node', watch: false, env: { - DATA_CACHE_DIR: 'test-data' + DATA_CACHE_DIR: path.join(os.tmpdir(), 'sourcebot-test-data') } } });packages/web/src/app/[domain]/repos/page.tsx (1)
11-19
: Guard against emptyjobs
array.Line 12 accesses
repo.jobs[0]
without checking if the array is empty. If a repository has no indexing jobs, this will evaluate toundefined
, which is safe but could be made more explicit.Apply this diff for clarity:
function getRepoStatus(repo: { indexedAt: Date | null, jobs: RepoIndexingJob[] }): RepoStatus { - const latestJob = repo.jobs[0]; + const latestJob = repo.jobs.length > 0 ? repo.jobs[0] : null; if (latestJob?.status === 'PENDING' || latestJob?.status === 'IN_PROGRESS') { return 'syncing'; } return repo.indexedAt ? 'indexed' : 'not-indexed'; }packages/db/prisma/schema.prisma (1)
51-53
: Add an index on Repo.indexedAt to speed periodic scans.Schedulers filter by indexedAt; index it to reduce full scans on large tables.
model Repo { // ... - indexedAt DateTime? /// When the repo was last indexed successfully. + indexedAt DateTime? /// When the repo was last indexed successfully. + + @@index([indexedAt]) }packages/backend/src/index.ts (2)
67-68
: Avoid multi-replica double-scheduling; run one scheduler cluster-wide.Starting schedulers on every replica risks duplicate job creation (even with GroupMQ). Use leader election or a distributed lock (e.g., Redis-based redlock) so only one instance runs the scheduling loop at a time. Workers can still run on all replicas.
I can wire a simple Redis lock around startScheduler if you want.
37-42
: Remove TOCTOU on mkdir; just mkdir with recursive.existsSync + mkdir is racy across replicas. Calling mkdir with { recursive: true } is idempotent.
-if (!existsSync(reposPath)) { - await mkdir(reposPath, { recursive: true }); -} -if (!existsSync(indexPath)) { - await mkdir(indexPath, { recursive: true }); -} +await mkdir(reposPath, { recursive: true }); +await mkdir(indexPath, { recursive: true });packages/backend/src/repoIndexManager.ts (4)
246-264
: Avoid per-job process signal listeners; risk MaxListeners warnings at high concurrency.Register one handler in the constructor that aborts all active jobs. Track controllers in a Set; add/remove per job.
export class RepoIndexManager { - private interval?: NodeJS.Timeout; + private interval?: NodeJS.Timeout; private queue: Queue<JobPayload>; private worker: Worker<JobPayload>; + private activeJobControllers = new Set<AbortController>(); + private onSignal = () => { + logger.info('Received shutdown signal, aborting active jobs...'); + for (const c of this.activeJobControllers) c.abort(); + }; constructor( // ... ) { // ... + process.on('SIGTERM', this.onSignal); + process.on('SIGINT', this.onSignal); }And in runJob:
- const abortController = new AbortController(); - const signalHandler = () => { - logger.info(`Received shutdown signal, aborting...`); - abortController.abort(); - }; - process.on('SIGTERM', signalHandler); - process.on('SIGINT', signalHandler); + const abortController = new AbortController(); + this.activeJobControllers.add(abortController); try { // ... } finally { - process.off('SIGTERM', signalHandler); - process.off('SIGINT', signalHandler); + this.activeJobControllers.delete(abortController); }Also remove listeners in dispose if you add them in constructor:
public async dispose() { if (this.interval) clearInterval(this.interval); await this.worker.close(); await this.queue.close(); + process.off('SIGTERM', this.onSignal); + process.off('SIGINT', this.onSignal); }
279-289
: Redundant git-repo validation; call once with signal.isPathAValidGitRepoRoot is called twice; use a single call with signal.
- if (existsSync(repoPath) && !(await isPathAValidGitRepoRoot( { path: repoPath } ))) { - const isValidGitRepo = await isPathAValidGitRepoRoot({ - path: repoPath, - signal, - }); + if (existsSync(repoPath)) { + const isValidGitRepo = await isPathAValidGitRepoRoot({ path: repoPath, signal }); if (!isValidGitRepo && !isReadOnly) { logger.warn(`${repoPath} is not a valid git repository root. Deleting directory and performing fresh clone.`); await rm(repoPath, { recursive: true, force: true }); } - } + }
391-397
: Make cleanup completion idempotent.The second duplicate CLEANUP job will throw on repo.delete. Use deleteMany and log outcome.
- const repo = await this.db.repo.delete({ - where: { id: jobData.repoId }, - }); - - logger.info(`Completed cleanup job ${job.data.jobId} for repo ${repo.name} (id: ${repo.id})`); + const res = await this.db.repo.deleteMany({ where: { id: jobData.repoId } }); + if (res.count > 0) { + logger.info(`Completed cleanup job ${job.data.jobId} for repoId ${jobData.repoId}`); + } else { + logger.info(`Cleanup job ${job.data.jobId}: repoId ${jobData.repoId} already deleted`); + }
363-367
: Parallelize shard deletions and avoid full-dir scans if directory is large.
- Deleting sequentially is slow; use Promise.allSettled for better throughput.
- Consider storing shard paths or segregating per-org/repo dirs to avoid scanning the full INDEX_CACHE_DIR.
- for (const file of files) { - const filePath = `${INDEX_CACHE_DIR}/${file}`; - logger.info(`Deleting shard file ${filePath}`); - await rm(filePath, { force: true }); - } + await Promise.allSettled( + files.map((file) => { + const filePath = `${INDEX_CACHE_DIR}/${file}`; + logger.info(`Deleting shard file ${filePath}`); + return rm(filePath, { force: true }); + }) + );packages/web/src/app/[domain]/repos/columns.tsx (1)
112-162
: Status filter uses labels; works, but consider value-based filters to avoid i18n coupling.Filtering by display labels ties logic to UI text. Prefer filtering by the enum value and map labels only for rendering.
- const uniqueLabels = Object.values(statusLabels); - const currentFilter = column.getFilterValue() as string | undefined; + const uniqueValues = Object.keys(statusLabels) as RepoStatus[]; + const currentFilter = column.getFilterValue() as RepoStatus | undefined; ... - {uniqueLabels.map((label) => ( - <DropdownMenuItem key={label} onClick={() => column.setFilterValue(label)}> + {uniqueValues.map((value) => ( + <DropdownMenuItem key={value} onClick={() => column.setFilterValue(value)}> - <Check className={cn("mr-2 h-4 w-4", column.getFilterValue() === label ? "opacity-100" : "opacity-0")} /> - {label} + <Check className={cn("mr-2 h-4 w-4", column.getFilterValue() === value ? "opacity-100" : "opacity-0")} /> + {statusLabels[value]} </DropdownMenuItem> ))} ... - const status = row.getValue(id) as RepoStatus; - return statusLabels[status] === value; + const status = row.getValue(id) as RepoStatus; + return value === undefined ? true : status === value;packages/backend/src/git.ts (1)
149-161
: LGTM; consider explicit local scope for clarity.Writes look correct. Optionally pass an explicit local scope to avoid ambiguity across environments.
...well this PR escalated quickly 👀
TL;DR: solved some repo indexing concurrency issues by moving from BullMQ to GroupMQ, refactored how we represent indexing jobs in the db, and generally deleted a ton of things that got in the way of web perf.
Where it started
We were getting bug reports (#523, #436) that repo indexing would fail at the clone or fetch stage with strange git errors. After some investigation on a test deployment of ~1k repos, we noticed that the root cause seemed to be that multiple indexing jobs could run on the same repository at the same time. When this happens, two workers will attempt to perform git operations on the same folder, resulting in the weird git behaviour we were seeing. For example:
repoid=1
repoid=1
/repos/1
/repos/1
<-- now two git processes are operating on the same repo. bad news -->
Q: Why were two workers being scheduled for the same repository in the first place?
A: I'm still not certain, but here's my theory: The job scheduler uses the the database in order to schedule new indexing jobs. This happens in two operations: 1) fetch all repositories that need indexing, and 2) create BullMQ jobs for these repositories and flag them as indexing. We were not performing 1 and 2 in a transaction, so this was not an atomic operation. I noticed that failures would typically happen in our k8s cluster when a new replica was being spun up. Likely, we were getting interleavings between replica A and B something like
A:1, B:1, A:2, B:2
, resulting in duplicate jobs for the same set of repos.Put another way, we were using the Postgres database as a distributed lock to enforce the "1 repo per worker" invariant. This felt error prone - what we really want is the invariant to be enforced at the queue level within Redis.
BullMQ has the concept of groups:
This sounds like the perfect solution: each repository can be in it's own group such that we are guaranteed that two jobs operating on the same repository must be processed one-by-one. Unfortunately, this feature is only available in the commercial version of BullMQ, and taking a dependency on a commercially licensed upstream dependency felt like a non-starter.
Enter groupmq - it's a brand new library (literally on v1.0.0) built by OpenPanel.dev. It supports grouping, has a similar api as BullMQ, and is MIT licensed. Was slightly hesitant to use such a new library, but their docs look good, so I decided to move to that.
repoIndexManager.ts
contains the new & improved indexer. In addition to moving to GroupMQ, I've also moved to representing job status in a separateRepoJob
table in s.t., we can maintain a 1:1 mapping between a job in Redis and how we represent it in the database. This PR deprecates therepoIndexingStatus
field.Where it went
After updating the indexer and moving to using the
RepoJob
table, I had to refactor how the progress indicators, repo carousel, and repo table work. This led into a bunch of chore work in web:/connections
viewFixes #523
Fixes #436
Fixes #462
Fixes #530
Fixes #452
Summary by CodeRabbit
Release Notes
New Features
Improvements
Bug Fixes